A Proofs
Evaluation. The evaluation methods we used are summarized in Algorithm 2 and Algorithm 3, and Table 3 summarizes the hyperparameters used for our evaluations.

B.2 Implementation details for BPC-rKL. To obtain a Bayesian pseudocoreset with the reverse KL divergence via Algorithm 1 in [19], we require a differentiable augmentation function A (optional). In Figure 1b, we show the accuracy with varying variances, with the different values presented as colors. Table 5 shows additional results for the CIFAR10 dataset when the pseudocoreset size is larger; even in these cases, BPC-W and BPC-fKL effectively generate Bayesian pseudocoresets.
Cohort-Based Active Modality Acquisition
Tillmann Rheude, Roland Eils, Benjamin Wild
Real-world machine learning applications often involve data from multiple modalities that must be integrated effectively to make robust predictions. However, in many practical settings, not all modalities are available for every sample, and acquiring additional modalities can be costly. This raises the question: which samples should be prioritized for additional modality acquisition when resources are limited? While prior work has explored individual-level acquisition strategies and training-time active learning paradigms, test-time and cohort-based acquisition remain underexplored. We introduce Cohort-based Active Modality Acquisition (CAMA), a novel test-time setting to formalize the challenge of selecting which samples should receive additional modalities. We derive acquisition strategies that leverage a combination of generative imputation and discriminative modeling to estimate the expected benefit of acquiring missing modalities based on common evaluation metrics. We also introduce upper-bound heuristics that provide performance ceilings to benchmark acquisition strategies. Experiments on multimodal datasets with up to 15 modalities demonstrate that our proposed imputation-based strategies can more effectively guide the acquisition of additional modalities for selected samples compared with methods relying solely on unimodal information, entropy-based guidance, or random selection. We showcase the real-world relevance and scalability of our method by demonstrating its ability to effectively guide the costly acquisition of proteomics data for disease prediction in a large prospective cohort, the UK Biobank (UKBB). Our work provides an effective approach for optimizing modality acquisition at the cohort level, enabling more effective use of resources in constrained settings.
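The cohort-level selection described in the abstract can be sketched as a scoring loop: for each sample missing a modality, draw imputations from a generative model, estimate how much the discriminative model's prediction would improve, and spend the acquisition budget on the highest-scoring samples. The sketch below is a minimal illustration, not the paper's implementation; `impute_modality`, `predict_proba`, and the confidence-gain score are hypothetical stand-ins for the actual generative imputer, classifier, and evaluation metric.

```python
import numpy as np

rng = np.random.default_rng(0)

def impute_modality(x_obs):
    # Hypothetical generative imputer: draws a plausible completion of the
    # missing modality conditioned on the observed one.
    return x_obs + rng.normal(scale=0.1, size=x_obs.shape)

def predict_proba(x_obs, x_missing=None):
    # Hypothetical discriminative model; it can use the second modality
    # when available.
    logit = x_obs.mean() + (x_missing.mean() if x_missing is not None else 0.0)
    return 1.0 / (1.0 + np.exp(-logit))

def expected_benefit(x_obs, n_draws=32):
    # Expected gain in prediction confidence from acquiring the missing
    # modality, estimated by Monte Carlo over generative imputations.
    base = abs(predict_proba(x_obs) - 0.5)
    gains = [abs(predict_proba(x_obs, impute_modality(x_obs)) - 0.5) - base
             for _ in range(n_draws)]
    return float(np.mean(gains))

def select_for_acquisition(cohort, budget):
    # Rank samples by estimated benefit; acquire for the top-`budget` ones.
    scores = [expected_benefit(x) for x in cohort]
    return sorted(range(len(cohort)), key=lambda i: -scores[i])[:budget]

cohort = [rng.normal(size=8) for _ in range(20)]
chosen = select_for_acquisition(cohort, budget=5)
print(chosen)
```

In practice the benefit score would be tied to the task metric (e.g., expected change in AUROC contribution) rather than raw confidence, and the imputer would be a learned conditional generative model.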
A Appendix
A.1 Notations. Table 4 summarizes the notations used in this paper. Let n be the size of set A. Hence, the above formulation of the set function is submodular and is an instance of a concave-over-modular function. Table 8 shows the training times for this setting. In contrast, the GM function selects representative samples from the source domain.
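The concave-over-modular claim can be checked numerically: for a concave g and a modular (additive) set function m, f(A) = g(m(A)) satisfies the diminishing-returns inequality f(A ∪ {e}) − f(A) ≥ f(B ∪ {e}) − f(B) for all A ⊆ B and e ∉ B. A minimal sanity-check sketch, assuming g = √· and unit element weights (so m(A) = |A|):

```python
import itertools
import math

def f(A):
    # Concave over modular: g = sqrt (concave), m(A) = |A| (modular, unit weights).
    return math.sqrt(len(A))

ground = range(5)
subsets = [set(c) for r in range(len(ground) + 1)
           for c in itertools.combinations(ground, r)]

violations = 0
for A in subsets:
    for B in subsets:
        if not A <= B:
            continue
        for e in set(ground) - B:
            # Diminishing returns: the marginal gain of e must not grow
            # as the base set grows from A to B.
            if f(A | {e}) - f(A) < f(B | {e}) - f(B) - 1e-12:
                violations += 1
print(violations)  # 0: no violations, consistent with submodularity
```

This checks every pair A ⊆ B exhaustively on a 5-element ground set; it is a sanity check, not a proof, but the general result (concave of modular is submodular) is standard.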
Learning to See by Looking at Noise - Supplementary Material
Dead leaves - Textures

For the experiments in Section 4.1, we follow the training procedure described in [ ]. The dimensionalities of the last and the penultimate embeddings are 128 and 4096, respectively. For the experiments in Section 4.2, we follow the training procedure described in [ ]. In bold, the best synthetic dataset; underlined when it also outperforms training with real images. From left to right, the columns correspond to the tasks: EuroSAT, Resisc45, Diabetic Retinopathy, and Patch Camelyon. From left to right, the columns correspond to the tasks: Clevr-Closest Object Distance, Clevr-Count, dSprites-Orientation, dSprites-Label X-position, SmallNORB-Elevation, sNORB-Azimuth, DMLab, and KITTI-Closest Vehicle Distance.
UP-DP: Unsupervised Prompt Learning for Data Pre-Selection with Vision-Language Models
In this work, we focus on a new and practical task of data pre-selection for data-efficient visual object recognition (Fig. 1-a). The goal of data pre-selection is to select instances for labeling from an unlabeled dataset in a single pass to maximize model performance for unknown downstream vision tasks (e.g., no knowledge about